Strategy evaluation schemes are a crucial factor in any agent-based market model, as they determine
the agents' strategy preferences and consequently their behavioral pattern. This study
investigates how the strategy evaluation schemes adopted by agents affect their performance in con- 
junction with the market circumstances. We observe the performance of three strategy evaluation 
schemes, the history-dependent wealth game, the trend-opposing minority game, and the trend- 
following majority game, in a stock market where the price is exogenously determined. The price is 
either directly adopted from the real stock market indices or generated with a Markov chain of order
≤ 2. Each scheme's success is quantified by the average wealth accumulated by the traders equipped
with the scheme. The wealth game, as it learns from the history, shows relatively good performance 
unless the market is highly unpredictable. The majority game is successful in a trendy market 
dominated by long periods of sustained price increase or decrease. On the other hand, the minority 
game is suitable for a market with persistent zig-zag price patterns. We also discuss the consequence 
of implementing finite memory in the scoring processes of strategies. Our findings suggest under 
which market circumstances each evaluation scheme is appropriate for modeling the behavior of real 
market traders. 

I. INTRODUCTION

Bounded rationality [1] is now widely accepted as a
fundamental aspect of the human decision-making process.
Insufficient information and cognitive limitations force 
people to rely on lessons of experience rather than rea- 
son out the optimal solution. Hence, especially in the 
fields of behavioral economics and econophysics, tradi- 
tional assumptions of perfectly rational agents making 
a priori optimal decisions have been replaced by agents 
of limited cognitive capacity who follow rules of thumb 
empirically validated. An archetypical implementation
of bounded rationality is found in the famous El Farol
Bar problem and its simplified variant, the minority
game. In the minority game, played by multiple
agents for multiple turns, each player opts for one of two
choices at each turn and wins the turn if the player is
on the minority side. Since it is impossible to predict the
choice of the other players, an optimal choice simply does
not exist. Instead, each player forms a set of strategies
which predict the behavior of the other players and advise
the player which side to choose given the latest choice
pattern of the other players. Using a strategy evaluation
scheme, an agent compares the predictions of its strate- 
gies with the actual outcome of the game and evaluates 
the credibility of each strategy. At each turn, the advice 
of the most credible strategy is followed. In the standard
minority game, strategies are evaluated in terms of
their minority game score, i.e., how often they correctly
land on the minority side. Different strategy evaluation
schemes may be used in other models of bounded rationality,
but the structure of those models is essentially
similar to the one outlined here.

The minority game has been readily applied in modeling
financial markets. Its boundedly rational
agents choosing from strategies bear some resemblance to
real market participants. The fluctuating choice pattern
shows features reminiscent of stylized facts of financial
markets [14]. Phase transition between symmetric and
asymmetric regimes gives clues as to how markets
self-organize to be marginally efficient. Yet
whether the minority game faithfully captures the behavior
of financial market speculators has been
questioned. One can observe that excess demand
(supply) works to the advantage of sellers (buyers), thus 
favoring the minority, but this is merely a superficial sim- 
ilarity. Whether a strategy is successful is fully revealed 
only when an agent buys (short-sells) an item and sells 
(buys) it back later at a higher (lower) price. While the 
outcome of a single event determines the payoff in the mi- 
nority game, in real markets we need at least two events 
separated in time to determine the payoff. 

A variety of alternative models have been proposed
to address this problem. One class of them assumes
that an agent buys (sells) an item and then immediately
sells (buys) it back at the next time step. Marsili [10]
showed that if agents expect price changes in adjacent
time steps to be negatively correlated, the minority
game payoff is justified. But if agents expect the
price changes to be positively correlated, they should use
the majority game payoff to evaluate their strategies. In
that case, the strategies opting for the majority decision
obtain higher scores. Since both expectations about
price behavior are equally justifiable, we are led to an
alternative market model where agents of both minority
and majority expectations coexist. Meanwhile, starting
from the same buy-today-and-sell-tomorrow assumption,
other studies derived the $-game payoff, which
can be roughly considered a time-delayed version of the
majority game payoff [13].

Another class of alternatives involves wealth, the to- 
tal value of cash and assets in an agent's possession. 
Besides being a natural measure of success, wealth can 
be updated at each time step without considering two 
temporally separated events. Furthermore, past buy or 
sell decisions continue to affect the way score changes, 
since wealth fluctuates according to the value of financial 
items accumulated through time. This is not the case for 
the minority game or the majority game, since the score 
change is dependent only on the decision made in the pre- 
vious turn. Hence agents evaluating their strategies us- 
ing wealth-based payoff are far more history-dependent, 
and their collective behavior often leads to quite regu- 
lar price behavior which may explain how different price 
trends are formed. Yeung et al. [17] performed a comprehensive
study on a market model incorporating wealth as
the measure of both an agent's success and a strategy's 
credibility, which they termed the wealth game. 

All these different payoff schemes grew out of efforts 
to capture the characteristics of real financial markets 
more closely. Now it is natural to ask which ones are 
more relevant. Taking a Darwinian perspective, market 
participants are likely to use strategy evaluation schemes 
that are most beneficial for them, i.e., best at accumulating
wealth. Thus we may let agents with different
strategy evaluation schemes participate in a market, and
compare their performance on the basis of wealth. For
example, Andersen et al. [12] compared the wealth of the
best minority game player with that of the worst in a 
market exclusively composed of minority game players, 
where the price is completely determined by the collec- 
tive behavior of the minority game players. They found 
that the best player was actually poorer than the worst 
player in terms of wealth. This inconsistency implies that 
the minority mechanism cannot dominate the entire market
for long. 

However, if minority game players account for only a 
small part of the market and are effectively decoupled 
from the price dynamics, there can be special situations 
when their strategy evaluation scheme proves profitable. 
In such cases, it is more suitable to follow the methodology
set out by Yeung et al. [17], which compares the
average wealth achieved by different strategy evaluation 
schemes when the price data are exogenously given. Ye- 
ung et al. used price data taken from real financial mar- 
kets, such as the Hang Seng Index (HSI). Since real mar- 
ket data cannot be directly controlled and their com- 
plexity evades any simple quantitative description, if we 
experiment with such data, it is hard to draw any gen- 
eral conclusions about the relation between the market 
trend and the corresponding suitable behavior patterns
of agents. In order to clarify the relation systematically,
we use artificial price data generated with a Markov pro- 
cess characterized by at most two parameters. Since we 
can now try various kinds of price behavior controlled by 
as few parameters as possible, we hope our results are 
better established and have more general implications. 

This paper is organized as follows. First, we describe
our model in Sec. II. To get some intuition about the
problem, we compare the performance of different strategy
evaluation schemes in real markets in Sec. III, using
the Korea Composite Stock Price Index (KOSPI) and the
HSI as price data. Conclusions drawn from this section
are systematically verified in Sec. IV using artificial price
data generated by the Markov process. While our studies
mainly concern strategy evaluation schemes with infinite
score memory, whether introducing finite score memory
changes the result is discussed in Sec. V. Finally, the
summary of our results is presented in Sec. VI.

II. MODEL 

We use a variant of the original wealth game
model [17], where the agents' buy or sell decisions have
no influence on the price dynamics. This modification al- 
lows us to directly control the price dynamics, so that we 
can study the correlation between the price behavior and 
the profitability of each strategy evaluation scheme. Our 
"exogenized" model can be considered an approximation 
of the reality, if it concerns only a small fraction of the en- 
tire group of market participants. The situation is quite 
similar to the canonical ensemble in statistical mechanics, 
where we consider the temperature of the system a vari- 
able directly controlled by the external heat bath whose 
heat capacity is much larger than that of the system. 

Consider N agents participating in a stock market. As 
previously pointed out, they account for only a small part 
of the market. At time step t, agent i decides whether to 
buy or sell a unit of stock, or to abstain from trade. These
possible actions are represented by $a_i(t) = +1$, $-1$, and $0$,
respectively. Agent $i$'s position is the accumulation of the
agent's past actions, written as

$$k_i(t) = \sum_{t'=0}^{t-1} a_i(t'). \tag{1}$$

This indicates the amount of stock the agent owns (if
positive) or owes (if negative). Agent $i$'s wealth is the
sum of its cash $C_i(t)$ and stock,

$$W_i(t) = C_i(t) + k_i(t) P(t), \tag{2}$$

where $P(t)$ is the price of a unit of stock. At each time
step, the agent's action changes its cash by

$$C_i(t+1) = C_i(t) - a_i(t) P(t+1). \tag{3}$$

Therefore the agent's wealth is updated by the rule

$$W_i(t+1) = W_i(t) + k_i(t) \left[ P(t+1) - P(t) \right]. \tag{4}$$
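
The consistency of Eqs. (1)-(4) can be checked numerically. The following is a minimal sketch, assuming that the agent starts with zero position and all of its wealth in cash; the function names, the action sequence, and the prices are hypothetical and serve only as an illustration.

```python
def simulate_agent(actions, prices, initial_cash):
    """Track cash, position and wealth of one agent via Eqs. (1)-(3).

    actions[t] is a_i(t) in {+1, -1, 0}; prices[t] is P(t).
    The trade decided at step t is settled at the next price P(t+1).
    Returns the list of wealth values W_i(0), ..., W_i(T).
    """
    cash, position = initial_cash, 0
    wealth = [cash + position * prices[0]]              # Eq. (2) at t = 0
    for t, a in enumerate(actions):
        cash -= a * prices[t + 1]                       # Eq. (3)
        position += a                                   # Eq. (1), incrementally
        wealth.append(cash + position * prices[t + 1])  # Eq. (2)
    return wealth


def wealth_by_increment(actions, prices, initial_cash):
    """Same wealth series, built from the update rule of Eq. (4)."""
    position, wealth = 0, [initial_cash]
    for t, a in enumerate(actions):
        # k_i(t) is the position before the action of step t is added.
        wealth.append(wealth[-1] + position * (prices[t + 1] - prices[t]))
        position += a
    return wealth


actions = [+1, +1, 0, -1]            # hypothetical action sequence
prices = [100, 101, 100, 101, 102]   # hypothetical prices P(0), ..., P(4)
print(simulate_agent(actions, prices, 500))       # [500, 500, 499, 501, 503]
print(wealth_by_increment(actions, prices, 500))  # [500, 500, 499, 501, 503]
```

Both routes give the same series, which illustrates that the wealth change at each step depends on the accumulated position rather than on the latest action alone.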

The agents share an $m$-bit market history, which
records the price increase (denoted by 1) and decrease
(denoted by 0) for the latest $m$ time steps. Also each
agent is provided with $s$ randomly drawn strategies. A
strategy is a mapping from the set of $2^m$ possible market
histories to the set of the agent's three possible actions
(buy, sell, or abstain), the total number of possible strategies
being $3^{2^m}$.
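
One possible way to represent such strategies is as lookup tables indexed by the market history. The sketch below is an illustration only: the function names are hypothetical, and the bit ordering (oldest move first) is an assumption of this sketch rather than something fixed by the text.

```python
import random


def history_index(bits):
    """Encode an m-bit history (oldest bit first, 1 = up, 0 = down) as an integer."""
    index = 0
    for b in bits:
        index = 2 * index + b
    return index


def random_strategy(m, rng):
    """Draw one strategy: an action in {+1, -1, 0} for each of the 2**m histories."""
    return tuple(rng.choice((+1, -1, 0)) for _ in range(2 ** m))


def count_strategies(m):
    """Total number of distinct strategies, 3**(2**m)."""
    return 3 ** (2 ** m)


rng = random.Random(0)                 # fixed seed, only for reproducibility
m, s = 2, 2                            # values used in the figures of this paper
strategies = [random_strategy(m, rng) for _ in range(s)]
action = strategies[0][history_index([1, 0])]   # advice after "up, then down"
print(count_strategies(m))             # 81
```

For $m = 2$ there are only 81 possible strategies, but the count grows doubly exponentially with $m$, so an agent holding $s = 2$ strategies samples an ever smaller part of the strategy space as $m$ increases.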

At every time step the agent updates the scores of 
its strategies according to a certain strategy evaluation 
scheme. The agent follows the suggestion of the highest- 
scoring strategy, although there is one exception to this 
rule. A suggestion that, if followed, would make the agent's
cash change sign from positive to negative, or would decrease
cash that is already negative, is ignored and replaced
with an abstention. This constraint, which we shall call
the non-negative cash constraint, is implemented by introducing
the position limitation

$$K_i(t) = \max\left[ \frac{W_i(t)}{P(t)},\, 0 \right], \tag{5}$$

so that any action increasing $|k_i| - K_i$ when $k_i > K_i$ or
$k_i < -K_i$ is forbidden.
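
A literal reading of this rule can be sketched as follows. This is only an illustration under stated assumptions: the position limit is taken to be $\max[W_i(t)/P(t), 0]$, and the function names are hypothetical.

```python
def position_limit(wealth, price):
    """Position limit K_i(t), assumed here to be max(W_i(t) / P(t), 0)."""
    return max(wealth / price, 0.0)


def constrained_action(suggested, position, wealth, price):
    """Return the suggested action, or 0 (abstain) if it is forbidden.

    An action is forbidden when the position is already outside
    [-K, K] and the action would push it further outside, i.e. when
    it would increase |k_i| - K_i.
    """
    limit = position_limit(wealth, price)
    outside = position > limit or position < -limit
    moves_further_out = abs(position + suggested) > abs(position)
    if outside and moves_further_out:
        return 0
    return suggested


# Hypothetical agent with wealth 500 at price 100, so the limit is 5.
print(constrained_action(+1, 6, 500, 100))   # 0: buying is blocked
print(constrained_action(-1, 6, 500, 100))   # -1: selling is still allowed
```

Under this reading, an agent that has exceeded its limit is never forced to trade; it is only prevented from moving further away from the allowed region.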

We consider three strategy evaluation schemes as classified
by Yeung et al. [17]. The score of strategy $\sigma$ at time
step $t$ is denoted by $u_\sigma(t)$.

(i) Wealth game (WG): the score of a strategy is updated
in a manner similar to the way an agent's
wealth is updated. That is, the score of strategy $\sigma$
is updated by

$$u_\sigma(t+1) = u_\sigma(t) + k_\sigma(t) \left[ P(t+1) - P(t) \right], \tag{6}$$

where $k_\sigma(t) = \sum_{t'=0}^{t-1} a_\sigma(t')$ is the virtual position
of strategy $\sigma$. Strategies are also subject to the
non-negative cash constraint: any action $a_\sigma(t)$ of
the strategy that makes the cash part of the strategy's
score change sign from positive to negative, or
decreases the cash part that is already negative, is
replaced with an abstention. Note that while agent
$i$'s wealth and its strategy $\sigma$'s score are updated in
the same manner, they are not equal, since agent
$i$'s action $a_i(t)$ and position $k_i(t)$ are different from
strategy $\sigma$'s suggested action $a_\sigma(t)$ and virtual position
$k_\sigma(t)$. We expect this position dependence to
grant WG the strongest history dependence among
the three schemes under consideration.

(ii) Minority game (MinG): the score updating rule of
MinG is given by

$$u_\sigma(t+1) = u_\sigma(t) - a_\sigma(t) \left[ P(t+1) - P(t) \right]. \tag{7}$$

Trend-opposing strategies are favorably scored,
as buying (selling) increases the score when the
price falls (rises). The buying (selling) party can
be roughly considered the minority in a bearish
(bullish) market, so we can say MinG favors the
minority. Hence the name of the scheme, despite
the fact that the agents do not necessarily have to
be in the minority in the exact sense of the term to
gain profits. As pointed out by Marsili [10], users of
this scheme believe that the immediate price trend
will be reversed soon. They are also short-sighted
(or eager to forget about the past) in the sense that
only the action of the previous turn, rather than the
position, affects the change of score.

(iii) Majority game (MajG): the score updating rule of
MajG is given by

$$u_\sigma(t+1) = u_\sigma(t) + a_\sigma(t) \left[ P(t+1) - P(t) \right]. \tag{8}$$

Trend-following strategies are favored, as buying
(selling) increases the score when the price rises
(falls). Just as MinG favors the minority,
MajG favors the majority. MajG users expect
that the current market trend will be sustained [10].
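
The three score updating rules of Eqs. (6)-(8) can be compared side by side in a short sketch. The function names and the example data are hypothetical, and the non-negative cash constraint on WG strategies is deliberately left out to keep the illustration short.

```python
def wg_update(score, virtual_position, price_now, price_next):
    """Wealth game, Eq. (6): the accumulated virtual position is rewarded."""
    return score + virtual_position * (price_next - price_now)


def ming_update(score, action, price_now, price_next):
    """Minority game, Eq. (7): acting against the price change is rewarded."""
    return score - action * (price_next - price_now)


def majg_update(score, action, price_now, price_next):
    """Majority game, Eq. (8): acting along the price change is rewarded."""
    return score + action * (price_next - price_now)


prices = [100, 101, 102, 103]   # hypothetical steadily rising price
actions = [+1, +1, +1]          # a strategy that always suggests buying
wg = ming = majg = 0
k = 0                           # virtual position k_sigma(t)
for t, a in enumerate(actions):
    wg = wg_update(wg, k, prices[t], prices[t + 1])
    ming = ming_update(ming, a, prices[t], prices[t + 1])
    majg = majg_update(majg, a, prices[t], prices[t + 1])
    k += a
print(wg, ming, majg)           # 3 -3 3
```

In this example the WG score lags behind the MajG score at first, since the virtual position starts from zero, but its increments grow with the accumulated position, while the MinG and MajG increments depend only on the latest action.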



III. PERFORMANCE IN REAL MARKETS 

We start by comparing the performance of the three 
strategy evaluation schemes in real financial markets. We 
use the closing price data of the HSI from December 31, 
1986 to June 10, 2009 and the KOSPI from July 1, 1997 
to June 10, 2009.



A. Adaptation to market trends 

One measure of the agents' overall adaptation to the 
market trends is the number of strategy-switching agents. 
A large number of strategy-switching agents indicates that
the agents are actively responding to changes in the market
trend. As Figs. 1(a)-1(c) show, the peaks of the
number of strategy switchers decrease in size, i.e., later
trend changes do not induce as much response from the 
agents as early trend changes do. Cumulative scoring of 
each strategy means that the score gap between strategies 
will broaden over time, which makes agents more reluc- 
tant to abandon previously successful strategies. Thus, 
the agents increasingly settle on one strategy. 

Decay of peaks is the most rapid for WG. Position dependence
of the WG wealth update rule quickly broadens
the score gap between successful and unprofitable strategies,
and this gap is not easily reduced unless a new market
trend persists for a sufficiently long period. Thus WG
agents tend to settle on their preferred strategies early on
and stick to them as long as possible. MinG and MajG agents
show similar behavior, but their strategy preferences are
not as clear as those of WG agents. Hence, MinG and
MajG agents are more sensitive to trend changes.

FIG. 1: (Color online) The performance of the three strategy evaluation schemes is observed when the price is exogenously
given by the real stock market indices: the HSI closing prices from December 31, 1986 to May 25, 2010 and the KOSPI closing
prices from July 1, 1997 to May 25, 2010. Other market parameters are given by N = 10000, m = 2, s = 2, and the initial
wealth of each agent is set equal to five times the initial stock price to encourage activity. The number of strategy-switching
agents is shown in the cases of (a) WG, (b) MinG, and (c) MajG. The average wealth gained by each strategy evaluation
scheme is shown for both (d) HSI and (e) KOSPI. For reference, the stock market indices are also shown in black.



B. Effect of trend changes on agents' wealth 

In the previous subsection, we observed that WG 
agents quickly adapt to the initial market trend. They 
can optimize their positions for the initial trend and efficiently
accumulate wealth. But since a position can be
changed by only one unit at a time, they are very vulnerable
to sudden trend changes. This is confirmed by the
average wealth curves shown in Figs. 1(d) and 1(e), especially
for the KOSPI during the 1997 Asian financial crisis.
While WG agents quickly realized the benefits of short- 
selling and prospered during the initial market crash, 
their advantage turned into a trap when the market be- 
gan to "recover" in 1998. Combination of large negative 
positions (built up by short-selling) and positive price 
changes meant WG agents were particularly hard hit by 
this sudden trend reversal. It took some time for WG 
agents to switch their strategies and reverse the sign of 
their positions, and for a while their average wealth was 
the lowest among the three schemes. This observation 
shows that while wealth is a convenient measure of suc- 
cess, it has its own shortcomings. An agent cannot liqui- 
date its own assets all at once, so if a large portion of the 
agent's wealth comes from assets, the amount of wealth 
is largely at the mercy of price fluctuations. 

MajG agents suffer similar difficulties in 1998, but their 
adaptation to the initial trend was not as complete as 
that of WG agents, i.e., their positions were not suffi- 
ciently negative. Hence, their initial wealth gain on av- 
erage was less than that of WG agents, but so was their 
loss due to the trend reversal. 

On the other hand, MinG agents are always least affected
by trend reversals. They always try to move
against the market trend, and consequently their actions
are severely restricted by the position constraint
in Eq. (5), limiting their positions to the near-zero region.
Hence MinG agents may temporarily attain the highest 
average wealth after some trend changes. But they can- 
not keep the lead for long if the new trend turns out to 
be stable, as in the KOSPI where WG eventually catches 
up with MinG. 

C. Initial trend significance 

Agents are most flexible in the initial stage and grow 
increasingly conservative over time. As time passes, 
the gaps between the scores of strategies broaden and 
wealth gained or lost in the initial phase limits the free- 
dom of the agents, both making it harder for agents to 
switch to a different strategy. Figure 1(d) shows that
MinG agents are the second most successful in the HSI 
despite the general increase of price. This is because 
the initial trend, somewhat periodical increase and de- 
crease of price, worked to MinG agents' advantage. This 
helped MinG agents attain good position and high aver- 
age wealth early on, and MajG never overcame this ini- 
tial disparity. A similar observation holds for the KOSPI, 
where WG was never able to make up for the advantage 
of MajG formed by the trend reversal in 1998. 

This initial trend dependence may be seen as a weak- 
ness of the strategy evaluation schemes considered in this 
study. But while real markets show a complex mixture of 
heterogeneous trends, the strategy evaluation schemes are 
designed as a simplified model of the behavior of market
participants. The extent to which those simplified agents
can cope with ever-changing market trends is bound to 
be limited, and hence the lingering influence of the initial 
trend. Traders in real markets are certainly more adap- 
tive than those model agents and initial trends would be 
less crucial for them. This is yet another reason why 
we experiment with artificially generated prices in the 
next section, so that the complexity of market trends is 
reduced and initial trend dependence becomes less signif- 
icant. 

In summary, WG agents quickly adapt to the initial 
market trend and draw the most profits from the trend. 
At the same time, however, WG agents suffer the most 
from sudden trend changes, while MinG agents are least
swayed by them. In addition, the initial trend is sig- 
nificant, since it affects the strategy preference and the 
freedom of choice for later periods. 



IV. PERFORMANCE IN ARTIFICIAL 
MARKETS 

A. Simulation settings 

FIG. 2: (Color online) The sample mean of the average wealth
gained by each strategy evaluation scheme at the 1000th time
step, measured for various values of the probability of price
increase $p_\uparrow$ and averaged over 1000 samples. The other parameters
are given by N = 10000, m = 2, and s = 2. The
error bars indicate the standard deviation of the average wealth
from the sample mean. Note that the lines are guides to the
eye.



To clarify the relationship between the price pattern
and the performance of strategy evaluation schemes,
we extend our study to include artificially generated
prices whose behavior can be described by a few
parameters. Let $p_\uparrow(\mu)$ be the probability that the price
will increase in the next time step given the latest $m$-bit
market history $\mu$. At each time step the price can increase
or decrease by 1. Starting from a random initial
$\mu$ and the initial price $P(0) = 1000$, we generate all the
subsequent price data.
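
The price generation procedure can be sketched as follows. This is a minimal illustration, not the code used for the simulations: the function name, the dictionary representation of $p_\uparrow(\mu)$, and the oldest-first ordering of the history are assumptions of this sketch.

```python
import random


def generate_prices(p_up, m, steps, rng, initial_price=1000):
    """Generate a price series whose up-probability depends on the last m moves.

    p_up maps each m-bit history (tuple of 1 = up, 0 = down, oldest first)
    to the probability that the next move is up.
    """
    history = tuple(rng.randint(0, 1) for _ in range(m))   # random initial history
    prices = [initial_price]
    for _ in range(steps):
        move_up = rng.random() < p_up[history]
        prices.append(prices[-1] + (1 if move_up else -1))
        # Append the newest move and drop the oldest one.
        history = (history + (int(move_up),))[1:]
    return prices


rng = random.Random(1)   # fixed seed, only for reproducibility
always_up = {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 1.0}
print(generate_prices(always_up, m=2, steps=3, rng=rng))   # [1000, 1001, 1002, 1003]
```

With `m=0` and a single entry `{(): p}` the same function produces a history-independent biased random walk, while `m=2` with four entries gives a Markov chain of order two.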

B. History-independent price behavior 

Consider the case when the price data are generated using
only a single probability $p_\uparrow$, the history-independent
probability of price increase. In other words, the price
dynamics is a biased random walk. If $p_\uparrow$ is sufficiently
larger or smaller than 1/2, the price reliably increases or
decreases from the beginning to the end. If $p_\uparrow \approx 1/2$, the
dynamics gets close to an unbiased random walk and the
price behavior becomes unpredictable.

Figure 2 shows the average wealth of agents for each
strategy evaluation scheme at the 1000th time step, for
each value of $p_\uparrow$. As the market becomes more predictable
($p_\uparrow$ farther away from 1/2), WG and MajG
are similarly more successful than MinG. Higher predictability
means that WG agents' lessons from history
are more useful. Since there is only one probability
involved, higher predictability is synonymous with stability
of trends, a favorable condition for trend-following
MajG agents. On the other hand, no strategy evaluation
scheme is significantly better than the others when the
market is unpredictable.



Note that the WG and MajG curves in Fig. 2 are highly
asymmetric. These curves indicate that agents are better
off when $p_\uparrow$ is low rather than high, which is a natural
consequence of our model. When $p_\uparrow$ is high, agents accumulate
wealth by maintaining large positive positions.
But as the agents run out of cash, the position constraint
Eq. (5) prevents them from buying stock any further.
Hence the average wealth increases only at a limited rate.
When $p_\uparrow$ is low, successful agents quickly build up large
negative positions. Such an agent's wealth has a positive
contribution from cash and a negative contribution from
the amount of sold stock. While the amount of cash
keeps increasing, the fraction of stock in the agent's total
wealth continues to decrease as the price falls.

Therefore, there is effectively no lower bound on the
negative position, which leads to an ever-accelerating wealth
gain of agents. This result runs against the intuition that
a bullish market is more profitable than a bearish market,
but it should be recalled that we are considering only
a small fraction of the entire market participants with
negligible influence on the price dynamics, who can freely
buy or sell. Under realistic circumstances, agents cannot
keep selling stock and building negative positions, since they
would have trouble finding buyers.

Note that this asymmetry is also reflected in the real
market simulation results shown in Fig. 1. Since the overall
increment of the HSI is larger than that of the KOSPI
during the observation period, the eventual ratio of the 
agents' average wealth to the stock index is much larger 
in case of the KOSPI than the HSI. 

FIG. 3: (Color online) Insets show in purple examples of price
data generated by different values of $p_L$ and $p_S$. Higher values
of $p_L$ indicate greater likelihood of long-term trends, and
higher values of $p_S$ indicate longer zig-zag oscillations.



C. Price generated by two-bit market history 

Now we generate the price data using four probability
parameters, $p_\uparrow(\downarrow\downarrow)$, $p_\uparrow(\downarrow\uparrow)$, $p_\uparrow(\uparrow\downarrow)$, and $p_\uparrow(\uparrow\uparrow)$. Each
parameter corresponds to the probability of price increase
given the latest two-bit market history, where the direction
of each arrow represents the direction of price
change. In this case, we can say the price is generated
by a Markov chain of order two.

Previous simulations show that a long-term price increase
or decrease favors WG and MajG over MinG. Now
we shall only consider the cases in which a general price increase
or decrease from the beginning to the end is suppressed.
For this we introduce two new constraints,

$$p_\uparrow(\downarrow\downarrow) + p_\uparrow(\uparrow\uparrow) = p_\uparrow(\downarrow\uparrow) + p_\uparrow(\uparrow\downarrow) = 1. \tag{9}$$

This reduces the number of parameters to two. We
define the long-term parameter $p_L$ and the short-term
parameter $p_S$ by

$$\begin{aligned}
p_L &= p_\uparrow(\uparrow\uparrow) - 0.5 = 0.5 - p_\uparrow(\downarrow\downarrow), \\
p_S &= p_\uparrow(\uparrow\downarrow) - 0.5 = 0.5 - p_\uparrow(\downarrow\uparrow).
\end{aligned} \tag{10}$$

If $p_L > 0$, the price is likely to increase or decrease for
three consecutive time steps or longer. If $p_L < 0$, a price
increase or decrease is not likely to continue for more than
two time steps. Thus $p_L$ controls the likelihood of long-term
price trends. Meanwhile, if $p_S > 0$, the price is likely
to show sustained zig-zag oscillations of period two. If
$p_S < 0$, a price increase or decrease is likely to continue
for at least two time steps. Hence $p_S$ controls how rapid
price oscillations are likely to be.
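
The mapping from $(p_L, p_S)$ back to the four conditional probabilities follows directly from Eqs. (9) and (10). The sketch below illustrates it; the function name is hypothetical, and each two-bit history is assumed to be read with the older move first.

```python
def two_bit_probabilities(p_long, p_short):
    """Build the four up-probabilities of Eq. (10) from p_L and p_S.

    Keys are two-bit histories (older move, latest move), 1 = up, 0 = down.
    """
    if not (-0.5 <= p_long <= 0.5 and -0.5 <= p_short <= 0.5):
        raise ValueError("p_L and p_S must lie in [-0.5, 0.5]")
    return {
        (1, 1): 0.5 + p_long,    # up, up: keep rising if p_L > 0
        (0, 0): 0.5 - p_long,    # down, down: keep falling if p_L > 0
        (1, 0): 0.5 + p_short,   # up, down: bounce back up if p_S > 0
        (0, 1): 0.5 - p_short,   # down, up: turn down again if p_S > 0
    }


table = two_bit_probabilities(0.4, -0.4)   # trend-dominated case
```

In the extreme case $p_L = -0.5$, $p_S = 0.5$ the table becomes deterministic and yields the period-2 pattern up, down, up, down, whereas $p_L = p_S = -0.5$ yields the period-4 pattern up, up, down, down.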

Examples of generated price data for different values of
$p_L$ and $p_S$ are shown in Fig. 3. Figure 4 visualizes the relative
performance of the three evaluation schemes, in terms
of the average wealth and the chance of achieving the
highest average wealth. We can simplify our observations
in terms of four extreme cases.

(i) $p_L > 0$ and $p_S > 0$

The price behavior is dominated by long-term
trends with intermittent zig-zag oscillations, as illustrated
by the $p_L = 0.4$ and $p_S = 0.4$ case shown
in Fig. 3. WG and MajG outperform MinG since
trend-following strategies are more suitable for a
trendy market. However, zig-zag oscillations are
also likely to persist, which is disadvantageous for
MajG. Hence, WG tends to be more successful than
MajG.

(ii) $p_L > 0$ and $p_S < 0$

The price behavior is completely dominated by
long-term trends without zig-zag oscillations, as illustrated
by the $p_L = 0.4$ and $p_S = -0.4$ case
shown in Fig. 3. WG and MajG are more successful
than MinG, with MajG narrowly in the lead.
It is not obvious why MajG should be better than
WG, but note that there are abrupt trend reversals
in the price pattern. As pointed out in Sec. III,
WG is particularly vulnerable to initial trend reversals.
Thus WG agents start with a slight disadvantage,
which they find difficult to make up for later even
if they eventually adapt themselves to the market.

(iii) $p_L < 0$ and $p_S > 0$

Long-term trends are suppressed and zig-zag oscillations
of period 2 ($\uparrow\downarrow\uparrow\downarrow\cdots$, for example) become
dominant, as illustrated by the $p_L = -0.4$
and $p_S = 0.4$ case shown in Fig. 3. MinG is the
most successful since its fundamentalistic expectations
turn out to be correct. WG still adapts well
to the price behavior and its average wealth is only
slightly less than that of MinG. MajG gets the lowest
average wealth due to its chartistic nature.

(iv) $p_L < 0$ and $p_S < 0$

Zig-zag oscillations of period 4 ($\uparrow\uparrow\downarrow\downarrow\cdots$, for example)
become dominant, as illustrated by the
$p_L = -0.4$ and $p_S = -0.4$ case shown in Fig. 3.
This condition is still more favorable for MinG than
for MajG, but the advantage of MinG is not as
strong as in the case $p_S > 0$. Thus, WG outperforms
MinG.

It should be noted that although MajG tends to show
the worst performance for $p_L < 0$, there is an exception
when $p_S \approx 0$. Whenever $p_S$ is far from zero while $p_L < 0$,
the price dynamics is dominated by regular zig-zag price
oscillations. But if $p_S \approx 0$, the regularity of the oscillations
is broken, giving some chance for MajG while impairing
the performance of MinG.

FIG. 4: (Color online) Diagrams showing the performance of strategy evaluation schemes for different values of $p_L$ and $p_S$, with
memory size m varied. The upper two diagrams indicate the average wealth of agents using each strategy evaluation scheme,
for (a) m = 2 and (b) m = 3. The lower two diagrams represent the chance of each evaluation scheme achieving the highest
average wealth, for (c) m = 2 and (d) m = 3. Other parameters are given by N = 10000 and s = 2, with the average wealth
measured at the 5000th time step. As the ternary plot on the right shows, superior performance is indicated by greater weight
put on the scheme's representative color. We use the average wealth of agents and the chance of attaining the highest average
wealth to visualize the relative performance of a strategy evaluation scheme. Note that the region where MinG or MajG is
successful increases as m is increased from 2 to 4. We observed the same trend when m is further increased to 10.

Even if we change the number of agents N
and the length of market history m to check the stability
of our results, qualitatively the same results as explained
above are observed. Interestingly, the area of the $p_L$-$p_S$
diagram in which MinG or MajG is the most successful
increases as m is increased, as shown in Fig. 4.
This is not because greater memory m enhances the performance
of MajG and MinG. Rather, all three schemes
tend to show worse performance for greater m, with the
impairment being the severest for WG. We suspect that
when m > 2, agents are trying hard to find some spurious
causal relations between market history and price
behavior, which can cause undesirable inefficiency.

The price dynamics is dependent only upon the latest
two time steps, but agents are considering longer time
spans to make their decisions. Thus they are likely to
draw false conclusions about the market trends, which
reduces their average wealth. Also, greater m leads
to a rapidly growing number of possible strategies ($= 3^{2^m}$),
but we have fixed the number of strategies available to
each agent at two. This further hinders the agents from
making correct decisions, since only a few agents would
be given strategies suitable for the market trends. Since
WG has the strongest dependence on history, WG agents suffer
the heaviest loss from these problems.

In most cases, WG manages to be the most successful
scheme, and otherwise it closely follows the scheme with the
highest average wealth. Although WG agents may suffer temporarily
from trend changes, eventually they learn to make up
for their loss as long as the market shows sufficient predictability.
Even if the market totally lacks predictability,
all schemes show similar performance, so even in such
cases we cannot say WG is worse than other schemes.
We can say WG is the most "versatile" among the three
schemes.

With this concept of Markov chain, we are able to measure
the probabilities of each movement,
$p_\downarrow(\downarrow\downarrow)$,
$p_\uparrow(\downarrow\uparrow)$,
$p_\uparrow(\uparrow\downarrow)$, and
$p_\uparrow(\uparrow\uparrow)$, for the real stock market
data, in retrospect. If the probabilities are measured for
each consecutive two-step and averaged over the entire
observation period, it turns out that for the HSI,
$p_\downarrow(\downarrow\downarrow) = 0.52$,
$p_\uparrow(\downarrow\uparrow) = 0.53$,
$p_\uparrow(\uparrow\downarrow) = 0.50$, and
$p_\uparrow(\uparrow\uparrow) = 0.53$. Similar values are
observed for the KOSPI with
$p_\downarrow(\downarrow\downarrow) = 0.51$,
$p_\uparrow(\downarrow\uparrow) = 0.54$,
$p_\uparrow(\uparrow\downarrow) = 0.53$, and
$p_\uparrow(\uparrow\uparrow) = 0.56$.
Note that the previous constraints in Eq. (5) do not
exactly hold in real data [20]. However, all the
probabilities are slightly above 0.5,
$p_\downarrow(\downarrow\downarrow) \approx p_\uparrow(\uparrow\uparrow)$, and
$p_\uparrow(\downarrow\uparrow) \approx p_\uparrow(\uparrow\downarrow)$,
which may give a slight overall advantage to WG and MajG
according to Fig. 4(a) (m = 2 case), reasonably consistent
with our observation.

V. FINITE SCORE MEMORY

Thus far, we have assumed that the scores of the strate- 
gies have infinite memory. That is, a strategy's sugges- 
tions made in the past and at present equally contribute 
to its score. As time passes, the sheer size of the past 
experience overwhelms contributions from the recent ex- 
perience. Thus agents grow increasingly unresponsive to 
trend changes, as the decrease of strategy switchers in 
Figs. (Ua)-rijc) clearly indicates. For the agents to stay 
highly adaptive to market changes, our model should im- 
plement finite score memory so that the emphasis is put 
upon the "present state of affairs." 

Finite score memory can be implemented by evaluat- 
ing the performance of a strategy in terms of the score 
accumulated over the latest T time steps [U [22[. If 
$u_a(t)$ denotes the infinite-memory score of a strategy as
originally defined by one of the three evaluation schemes
explained in Sec. II, we define the corresponding
finite-memory score $u_a^T(t)$ by

$$ u_a^T(t) = u_a(t) - u_a(t - T) . \tag{11} $$

It is straightforward to apply the above formula to MinG, 
MajG, and WG. For convenience, let these finite-score- 
memory variants of the original three strategy evalua- 
tion schemes be called the delta minority game (DMinG), 
the delta majority game (DMajG), and the delta wealth 
game (DWG). 
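
The finite-memory score of Eq. (11) can be sketched in a
few lines. The code below is a minimal illustration, not
the simulation code used in this work: the function names
are hypothetical, the infinite-memory score is assumed to
be stored as a list of cumulative values, and the score
before the first time step is assumed to be zero, so that
the window is simply shorter while t < T.

```python
from itertools import accumulate


def cumulative_scores(gains):
    """Infinite-memory score u(t): running sum of per-step score changes."""
    return list(accumulate(gains))


def finite_memory_score(u, t, T):
    """Finite-memory score u^T(t) = u(t) - u(t - T).

    Assumption of this sketch: u is indexed from t = 0 and the score
    before the first time step is taken to be zero.
    """
    if T <= 0:
        raise ValueError("score memory T must be positive")
    past = u[t - T] if t >= T else 0
    return u[t] - past


if __name__ == "__main__":
    u = cumulative_scores([1, -1, 1, 1, -1])
    # Score accumulated over the latest T = 2 steps at each time step.
    print([finite_memory_score(u, t, 2) for t in range(len(u))])
```

Storing the cumulative score and taking a difference avoids
re-summing the latest T contributions at every time step.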

A note of caution is in order in the case of DWG. The
virtual wealth $u_a(t)$ should obey the non-negative cash
constraint explained in Sec. II if $u_a^T(t)$ is to be
interpreted as the change of virtual wealth. Since the
score $u_a^T(t)$ is indirectly subject to the constraint
through $u_a(t)$, the DWG score is still under the
influence of the entire history. In this sense, the score
memory of DWG is not truly finite.

A. Real markets 

Figure 5 shows the average wealth achieved by DMinG,
DMajG, and DWG in real markets (KOSPI and HSI) on 
the last day of the period of observation, for different 
score memory sizes T . The performance of each strategy 
evaluation scheme strongly depends on the score memory 
size, especially in the case of KOSPI where the overall in- 
crease of price is much less dominant. While it is hard 
to find any general tendencies in the fluctuating aver- 
age wealth curves, they do suggest that the real market 
behavior is a complex mixture of market trends with dif- 
ferent time scales. 



B. Artificial markets 

Let us first consider the performance of DWG, DMinG, and
DMajG in an artificial market where the price is generated
by a biased random walk. We observe that finite score
memory does not make much difference in this case: DMinG
exhibits the worst performance among the three except for
the unpredictable case in which upward and downward moves
are almost equally likely (probability close to 0.5),
while DWG and DMajG are almost equally successful, in a
manner very similar to the result shown in Fig. 3. Finite
score memory is intended to improve the performance of
agents when the market trends change from time to time,
but a biased random walk generates very stable market
trends. Thus score memory is not so relevant in this case.

If the price is generated by a Markov chain of order 2
subject to the constraint Eq. (5), then finite score
memory causes some visible changes, as shown in Fig. 6.
Figures 6(a)-6(c) show the relative performances of DMinG,
DMajG, and the original WG. The original WG, whose score
memory is fixed to be infinite, is unaffected by the score
memory size T. Thus Figs. 6(a)-6(c) show how the absolute
performances of DMinG and DMajG are affected by score
memory. Note that the performances of DMinG and DMajG are
worse than those of their infinite-score-memory
counterparts, as indicated by the greater area in the
$p_l$-$p_s$ diagram dominated by the original WG. As the
score memory is increased, the shape of the $p_l$-$p_s$
diagram approaches the infinite score memory result shown
in Fig. 4. If we compare the performances of DMinG, DMajG,
and DWG as shown in Figs. 6(d)-6(f), we observe that the
performance of DWG is worse than that of the original WG,
but again approaches the original infinite score memory
result as T becomes large.
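
For readers who wish to reproduce this kind of artificial
market, the following sketch generates a sequence of price
moves from a Markov chain of order 2. It is an illustrative
assumption rather than the code used here: the function
names, the 'u'/'d' coding, the unit price step, and the
initial pair of moves are all hypothetical, and the four
probabilities are left free instead of being tied together
by the constraint of Eq. (5).

```python
import random


def generate_moves(p_up, n_steps, seed=None, start=("u", "d")):
    """Generate n_steps moves ('u' or 'd') from an order-2 Markov chain.

    p_up maps each two-move context (older move first, e.g. 'ud') to the
    probability that the next move is up. The two moves in `start` only
    seed the chain and are not returned.
    """
    rng = random.Random(seed)
    moves = list(start)
    for _ in range(n_steps):
        context = moves[-2] + moves[-1]
        moves.append("u" if rng.random() < p_up[context] else "d")
    return moves[2:]


def moves_to_prices(moves, initial_price=100, step=1):
    """Turn moves into a price series with a fixed step (an assumption)."""
    prices = [initial_price]
    for move in moves:
        prices.append(prices[-1] + (step if move == "u" else -step))
    return prices


if __name__ == "__main__":
    trendy = {"uu": 0.9, "dd": 0.1, "ud": 0.5, "du": 0.5}
    print(moves_to_prices(generate_moves(trendy, 20, seed=1)))
```

Setting the probabilities for 'uu' and 'dd' close to one
and zero, respectively, yields long sustained trends,
whereas setting 'ud' close to one and 'du' close to zero
yields a persistent zig-zag pattern.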

Our observations indicate that finite score memory does
not improve the agents' performance if the price is
generated by a Markov process of order $\leq 2$, although
the qualitative features of the observations made in the
infinite score memory cases are preserved. The price
series thus generated exhibit simple and steady market
trends, while finite score memory is intended for more
complex market trends. If the market trend is steady,
agents recognize the trend better by observing it for a
sufficiently long time than by throwing away their distant
past experiences.



VI. SUMMARY AND CONCLUSIONS 

We have studied the average performance of three strategy
evaluation schemes, WG, MinG, and MajG, in a market whose
price dynamics is exogenously determined by real market
data or a Markov process. We believe that
our simplified version of WG effectively captures the be- 
havior of a small portion of traders in a large stock mar- 
ket, and helps determine which evaluation scheme is use- 
ful for individual traders given a particular market sit- 
uation. We can expect that individuals whose influence 
on the market is negligible will use tests similar to the 
one presented in this paper, and choose the most suc- 
cessful scheme to evaluate their strategies. Thus we can 
tell which scheme most suitably describes the behavior of
agents under different market circumstances. While
incorporation of finite score memory does make a
difference in real markets, the basic conclusions drawn
from the artificial market observations are qualitatively
still valid.

FIG. 5: (Color online) The data points represent the average wealth gained by each strategy evaluation scheme shown for
different sizes of score memory T, while the dashed lines correspond to the average wealth obtained by the original infinite-
score-memory evaluation schemes. The average wealth is measured at the end of the observation period, which is (a) from
December 31, 1986 to May 25, 2010 for the HSI closing prices and (b) from July 1, 1997 to May 25, 2010 for the KOSPI closing
prices. Parameters other than the score memory are given by N = 10000, m = 2, s = 2, and an agent's initial wealth is set
equal to five times the initial price.

FIG. 6: (Color online) Each strategy evaluation scheme's chance of reaching the highest average wealth is shown for different
values of the score memory size T, in an artificial market whose price is generated by a Markov chain of order 2. Refer to the
ternary plot in Fig. 4 for the meaning of each color. The upper three diagrams compare the performance of the original WG,
DMinG, and DMajG for (a) T = 10, (b) T = 100, and (c) T = 1000. The lower three diagrams compare the performance of
DWG, DMinG, and DMajG for (d) T = 10, (e) T = 100, and (f) T = 1000. Other parameters are given by N = 10000, s = 2,
m = 2 with the average wealth measured at the 5000th time step.

Note that MajG is most successful when the price pat- 
tern is completely dominated by long-term trends, while 
MinG is the best choice if the price tends to show rapid 
oscillations. In other words, these schemes prosper when 
their expectations are fulfilled. Combined with Marsili's
observation [10] that the price behavior follows the
expectations of either MinG or MajG depending on which side
is more dominant in the market, we get an idea of how 
certain price patterns maintain themselves through posi- 
tive feedback. For instance, a bubble can maintain itself 
because long-term trends make MajG traders a domi- 
nant force in the market and MajG traders' expectations 
fulfill themselves. Whether we can model some negative 
counterpart of this feedback mechanism would be an in- 
teresting issue for further studies. 

We used average wealth to assess the viability of each
evaluation scheme, which at first glance may seem bi- 
ased toward WG. But as our simulation results show, 
other schemes can be more successful than WG depend- 
ing on the market behavior, even though wealth is the 
only measure of success in our model. Trend reversals 
and lingering influence of initial market trends are the 
main reasons why the performance of WG is impaired. 
WG agents are especially adept at building up positions 
optimal for the given price pattern, which means that 
a large portion of their wealth comes from their assets. 
Pattern changes can be too rapid for WG agents to fol- 
low up by moving to a new optimal position, and in such 
cases the agents are simply at the mercy of the market's 
whim. 

Still, we cannot ignore the advantages of WG. Its main 
strength lies in its versatility, as shown by tests using 
generated price data. WG agents learn from the history,
so they eventually adapt to any market trends and make up
for the losses from initial "mistakes." Thus, we
can justify using WG to model the behavior of traders 
in financial markets, whenever there is some degree of 
predictability and persistence in price patterns. 